Another common distribution is the normal, or Gaussian, distribution (a.k.a. the bell curve). Normally distributed random numbers are generated with the rnorm(n, mean, sd) function. The sample size n must always be given; the default values of the other parameters are mean=0 and sd=1.
Note that the histograms are not perfectly smooth, because our random sample is finite. As the sample size grows, the histogram approaches the theoretical distribution.
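The effect of sample size can be seen with a minimal sketch. The sample sizes and the number of bins below are arbitrary choices for illustration. It draws standard normal samples of increasing size and overlays the theoretical density computed with dnorm:

```r
sizes <- c(100, 1000, 10000)       # illustrative sample sizes
old_par <- par(mfrow = c(1, 3))    # three panels side by side
for (n in sizes) {
  z <- rnorm(n)                    # mean = 0 and sd = 1 by default
  # freq = FALSE scales the histogram so that its total area is 1
  hist(z, breaks = 30, freq = FALSE, main = paste("n =", n), xlab = "z")
  # overlay the theoretical normal density for comparison
  curve(dnorm(x), add = TRUE, col = "red", lwd = 2)
}
par(old_par)                       # restore the previous plot layout
```

The histogram in the rightmost panel should follow the red curve much more closely than the one in the leftmost panel.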
By adding random “noise” to deterministic vectors, we can simulate a real-life data set where the underlying “law” is \(y=x\).
x <- seq(0, 10, length.out = 101)
y <- x + rnorm(length(x), sd = 0.5)
plot(x, y)
Getting the same random sequence every time
In some cases we want to get the same random sequence in every simulation, so that we can identify and correct errors. For that, we can set the seed of the random number generator to a fixed number.
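A minimal sketch of this, using set.seed (the seed value 42 is an arbitrary choice; any fixed integer works the same way):

```r
set.seed(42)       # fix the state of the random number generator
a <- rnorm(3)
set.seed(42)       # reset the generator to the same state
b <- rnorm(3)
identical(a, b)    # TRUE: both calls produce the same three numbers
```

Without the second call to set.seed, b would continue the sequence and differ from a.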
In order to get the expected number of heads, we need to repeat the experiment many times and average over the outcomes. (See “loops” later.)
Suppose you gamble with a coin: You gain 1 TL if it comes up heads, and lose 1 TL otherwise. You repeat the coin toss 5 times. What is your balance at every step of the game?
Our gain is +1 if heads, and -1 if tails. To simplify the accounting, let us sample from the vector c(-1, 1) with replacement and take the cumulative sum.
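One way to write this, as a short sketch (the variable names steps and balance are chosen here for illustration):

```r
# each toss yields +1 (heads) or -1 (tails) with equal probability
steps <- sample(c(-1, 1), 5, replace = TRUE)
steps
# running total: the balance after each of the 5 tosses
balance <- cumsum(steps)
balance
```

The last element of balance is the final gain or loss after 5 tosses. Because the number of tosses is odd, it can never be zero.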
Suppose we generate random number pairs \((x,y)\) within the square \(-1\leq x\leq 1\) and \(-1\leq y\leq 1\). Some of them fall inside the inscribed circle \(x^2 + y^2 \leq 1\).
options(repr.plot.width = 4, repr.plot.height = 4)
x <- runif(100, -1, 1)
y <- runif(100, -1, 1)
# plot the random points:
plot(x, y, asp = 1)
# plot the unit circle:
t <- seq(0, 2*pi, length.out = 100)
xx <- cos(t)
yy <- sin(t)
lines(xx, yy, lwd = 3, col = "red")
The area of the circle is \(\pi\) and the area of the square is 4, so the fraction of the points that fall inside the circle gives an estimate of \(\pi/4\). So, we can approximate \(\pi\) as:
4 * sum(x^2 + y^2 <= 1) / length(x)
[1] 3.24
With a new set of random hits, the estimate will differ:
x <- runif(100, -1, 1)
y <- runif(100, -1, 1)
4 * sum(x^2 + y^2 <= 1) / length(x)
[1] 3.16
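With only 100 points the estimate fluctuates noticeably from run to run. The sketch below wraps the same calculation in a helper function so that it can be repeated with larger samples. The name estimate_pi and the sample sizes are assumptions made for this illustration and do not appear elsewhere in the text.

```r
# hypothetical helper: estimate pi from n random points in the square
estimate_pi <- function(n) {
  x <- runif(n, -1, 1)
  y <- runif(n, -1, 1)
  4 * sum(x^2 + y^2 <= 1) / n    # fraction inside the circle, times 4
}

sizes <- c(100, 10000, 1000000)  # illustrative sample sizes
estimates <- sapply(sizes, estimate_pi)
# compare each estimate with R's built-in constant pi
data.frame(n = sizes, estimate = estimates, error = abs(estimates - pi))
```

The error typically shrinks in proportion to \(1/\sqrt{n}\), so each additional correct digit requires roughly 100 times more points.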
Exercises
Write an R expression that simulates the outcome of the 6/49 Lottery (Sayısal Loto), where one draws 6 numbers from 1, 2, …, 49. Note that the same number cannot appear twice in one drawing.
Generate 1000 random numbers, drawn from the normal distribution with standard deviation 2, and another 1000 with standard deviation 0.5. Plot the histogram for each set of numbers. What can you say about the effect of the standard deviation?
Throw 10 coins and count the number of heads. Repeat this experiment ten times, and find the mean of the number of heads.
Throw 3 dice 10000 times and record the sum of the three faces each time. Plot the histogram of the sums (they should be between 3 and 18).